Saturation arithmetic is a version of arithmetic in which all operations such as addition and multiplication are limited to a fixed range between a minimum and maximum value. If the result of an operation is greater than the maximum it is set ("clamped") to the maximum, while if it is below the minimum it is clamped to the minimum. The name comes from how the value becomes "saturated" once it reaches the extreme values; further additions to a maximum or subtractions from a minimum will not change the result.
For example, if the valid range of values is from -100 to 100, the following operations produce the following values:
As can be seen from these examples, familiar properties like associativity and distributivity fail in saturation arithmetic. This makes it unpleasant to deal with in abstract mathematics, but it has an important role to play in digital hardware and algorithms.
Typically, early computer microprocessors did not implement integer arithmetic operations using saturation arithmetic; instead, they used the easier-to-implement modular arithmetic, in which values exceeding the maximum value "wrap around" to the minimum value, like the hours on a clock passing from 12 to 1. In hardware, modular arithmetic with a minimum of zero and a maximum of 2n can be implemented by simply discarding all but the lowest n bits.
However, although more difficult to implement, saturation arithmetic has numerous practical advantages. The result is as numerically close to the true answer as possible; it's considerably less surprising to get an answer of 127 instead of 130 than to get an answer of −126 instead of 130. It also enables overflow of additions and multiplications to be detected consistently without an overflow bit or excessive computation by simple comparison with the maximum or minimum value (provided the datum is not permitted to take on these values).
Additionally, saturation arithmetic enables efficient algorithms for many problems, particularly in digital signal processing. For example, adjusting the volume level of a sound signal can result in overflow, and saturation causes significantly less distortion to the sound than wrap-around. In the words of researchers G. A. Constantinides et al.:
When adding two numbers using two’s complement representation, overflow results in a ‘wrap-around’ phenomenon. The result can be a catastrophic loss in signal-to-noise ratio in a DSP system. Signals in DSP designs are therefore usually either scaled appropriately to avoid overflow for all but the most extreme input vectors, or produced using saturation arithmetic components.[1]
Saturation arithmetic operations are available on many modern platforms, and in particular was one of the extensions made by the Intel MMX platform, specifically for such signal processing applications.
Saturation arithmetic for integers has also been implemented in software for a number of programming languages including C, C++, Eiffel. This helps programmers anticipate and understand the effects of overflow better. On the other hand, saturation is challenging to implement efficiently in software on a machine with only modular arithmetic operations, since simple implementations require branches that create huge pipeline delays.
Although saturation arithmetic is less popular for integer arithmetic in hardware, the IEEE floating-point standard, the most popular abstraction for dealing with approximate real numbers, uses a form of saturation in which overflow is converted into "infinity" or "negative infinity", and any other operation on this result continues to produce the same value. This has the advantage over simple saturation that later operations which decrease the value will not end up producing a "reasonable" result, such as in the computation .